
Bagged projection methods for supervised classification in big data

Abstract

Classification methods are widely used for problems that require rules for sorting observations into groups. There are many ways to fit a classification model, but no single method is universally best. This research develops new classification methods, together with visual tools for exploring the algorithms and results introduced in this work. The new classification method is a random forest built from trees that split on linear combinations of variables, which improves predictive performance when the separation between classes lies in combinations of variables. It is called a projection pursuit random forest (PPF). The benefit of the method is demonstrated in a simulation study and on a suite of benchmark data. It is implemented in the R package PPforest, with core functions written in Rcpp to improve computational speed. Bagging and combining results from multiple trees produces numerous diagnostics which, paired with interactive graphics, can provide considerable insight into the class structure in high dimensions. A web app was designed and developed for this purpose. While developing the PPF, some deficiencies were observed in PPtree, the tree algorithm that forms its basic building block. This led to modifications to the algorithm, implemented in the R package PPtreeExt, and to a small web app that helps compare the effects of different model parameter choices.
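
To make the idea of bagging classifiers that split on linear combinations of variables concrete, here is a minimal sketch in plain Python. It is not the PPforest implementation or its API: every function name is hypothetical, each "tree" is a single split (a stump), only two classes are handled, and the projection direction is simply the difference of the class means, a crude stand-in for optimising a projection pursuit index.

```python
import random
from collections import Counter


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_projection_stump(X, y, features):
    """One split on a linear combination of the chosen features (labels 0 and 1)."""
    a = [[row[j] for j in features] for row, label in zip(X, y) if label == 0]
    b = [[row[j] for j in features] for row, label in zip(X, y) if label == 1]
    mean_a, mean_b = mean_vector(a), mean_vector(b)
    # Assumption: use the direction joining the class means instead of
    # optimising a projection pursuit index.
    direction = [mb - ma for ma, mb in zip(mean_a, mean_b)]
    # Cut halfway between the two projected class means.
    proj_a = sum(d * m for d, m in zip(direction, mean_a))
    proj_b = sum(d * m for d, m in zip(direction, mean_b))
    return features, direction, (proj_a + proj_b) / 2


def predict_stump(stump, row):
    features, direction, cut = stump
    score = sum(d * row[j] for d, j in zip(direction, features))
    return 1 if score > cut else 0


def fit_bagged_projection(X, y, n_trees=25, n_features=2, seed=0):
    """Fit stumps on bootstrap samples, each using a random subset of features."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    ensemble = []
    while len(ensemble) < n_trees:
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip samples that miss a class
            continue
        Xb = [X[i] for i in idx]
        features = rng.sample(range(p), n_features)
        ensemble.append(fit_projection_stump(Xb, yb, features))
    return ensemble


def predict(ensemble, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class=100, seed=1):
    """Two classes that differ along x0 + x1; x2 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5), rng.gauss(0.0, 0.5)])
            y.append(label)
    return X, y


X, y = make_toy_data()
forest = fit_bagged_projection(X, y)
accuracy = sum(predict(forest, row) == label for row, label in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

The toy data are invented for illustration only. A full projection pursuit forest would grow deeper trees, handle more than two classes, and choose each projection by optimising an index, none of which this sketch attempts.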